OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

A scalable software-architecture for high-speed color Document compression based on JPEG2000 (Part 6)

Internal identifier: 001A00 (Main/Exploration); previous: 001999; next: 001A01

Authors: Michael Thierschmann [Germany]; Uwe-Erik Martin [Germany]

Source: SPIE proceedings series (ISSN 1017-2653), 2002

RBID : Pascal:02-0512026

French descriptors

English descriptors

Abstract

Modern document scanning systems make it possible to process color documents with Document Management Systems (DMS). Because scanning a typical A4 document at 300 dpi generates an enormous amount of image data, image compression is used. The JPEG compression scheme is widely used for such image data. The loss of image quality caused by the necessary lossy compression can significantly reduce the recognition quality of a subsequent optical character recognition (OCR) process, which is essential to any DMS. The new standard JPEG2000 (Part 6), a high-performance system for compressing and archiving scanned documents, particularly those containing both text and images, bridges the gap between high compression and the legibility of documents managed inside a DMS. Using JPEG2000 (Part 6) yields substantially higher image quality than standard compression techniques. This quality is achieved by combining automatic text detection with bitonal compression of text and color/grayscale wavelet compression of images. Because the JPEG2000 (Part 6) compression scheme is a complex image processing system that requires nontrivial computational resources, a scalable software system has been designed to match the throughput of high-performance document scanners.
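
To give a sense of the "enormous amount of image data" mentioned in the abstract, the following is a minimal arithmetic sketch of the uncompressed size of one A4 page at 300 dpi. The abstract does not state a color depth; 24-bit RGB and 1-bit bitonal are assumptions made here for illustration, and the function name `raw_scan_bytes` is hypothetical. File headers and row padding are ignored.

```python
# A4 is 210 mm x 297 mm (ISO 216); one inch is 25.4 mm.
A4_MM = (210, 297)
MM_PER_INCH = 25.4


def raw_scan_bytes(dpi, bits_per_pixel, page_mm=A4_MM):
    """Return (width_px, height_px, size_in_bytes) for an uncompressed scan.

    ASSUMPTION: pixels are packed with no header and no row padding.
    """
    width_px = round(page_mm[0] / MM_PER_INCH * dpi)
    height_px = round(page_mm[1] / MM_PER_INCH * dpi)
    size_bytes = width_px * height_px * bits_per_pixel // 8
    return width_px, height_px, size_bytes


if __name__ == "__main__":
    # ASSUMPTION: 24 bits per pixel for color, 1 bit per pixel for bitonal.
    for label, depth in (("24-bit color", 24), ("1-bit bitonal", 1)):
        w, h, n = raw_scan_bytes(300, depth)
        print(f"{label}: {w} x {h} px, {n} bytes")
```

Under these assumptions a single color page is 2480 x 3508 pixels, or 26,099,520 bytes (about 26.1 MB) uncompressed, while the same page stored as a 1-bit image is 1,087,480 bytes. This gap is one reason a bitonal text layer is attractive for the text portions of a page.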


Affiliations:


Links toward previous steps (curation, corpus...)


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">A scalable software-architecture for high-speed color Document compression based on JPEG2000 (Part 6)</title>
<author>
<name sortKey="Thierschmann, Michael" sort="Thierschmann, Michael" uniqKey="Thierschmann M" first="Michael" last="Thierschmann">Michael Thierschmann</name>
<affiliation wicri:level="3">
<inist:fA14 i1="01">
<s1>LuraTech GmbH, Rotherstrasse 20</s1>
<s2>10245 Berlin</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<placeName>
<region type="land" nuts="3">Berlin</region>
<settlement type="city">Berlin</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Martin, Uwe Erik" sort="Martin, Uwe Erik" uniqKey="Martin U" first="Uwe-Erik" last="Martin">Uwe-Erik Martin</name>
<affiliation wicri:level="3">
<inist:fA14 i1="01">
<s1>LuraTech GmbH, Rotherstrasse 20</s1>
<s2>10245 Berlin</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<placeName>
<region type="land" nuts="3">Berlin</region>
<settlement type="city">Berlin</settlement>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">02-0512026</idno>
<date when="2002">2002</date>
<idno type="stanalyst">PASCAL 02-0512026 INIST</idno>
<idno type="RBID">Pascal:02-0512026</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000647</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000145</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000638</idno>
<idno type="wicri:doubleKey">1017-2653:2002:Thierschmann M:a:scalable:software</idno>
<idno type="wicri:Area/Main/Merge">001A91</idno>
<idno type="wicri:Area/Main/Curation">001A00</idno>
<idno type="wicri:Area/Main/Exploration">001A00</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">A scalable software-architecture for high-speed color Document compression based on JPEG2000 (Part 6)</title>
<author>
<name sortKey="Thierschmann, Michael" sort="Thierschmann, Michael" uniqKey="Thierschmann M" first="Michael" last="Thierschmann">Michael Thierschmann</name>
<affiliation wicri:level="3">
<inist:fA14 i1="01">
<s1>LuraTech GmbH, Rotherstrasse 20</s1>
<s2>10245 Berlin</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<placeName>
<region type="land" nuts="3">Berlin</region>
<settlement type="city">Berlin</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Martin, Uwe Erik" sort="Martin, Uwe Erik" uniqKey="Martin U" first="Uwe-Erik" last="Martin">Uwe-Erik Martin</name>
<affiliation wicri:level="3">
<inist:fA14 i1="01">
<s1>LuraTech GmbH, Rotherstrasse 20</s1>
<s2>10245 Berlin</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<placeName>
<region type="land" nuts="3">Berlin</region>
<settlement type="city">Berlin</settlement>
</placeName>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">SPIE proceedings series</title>
<idno type="ISSN">1017-2653</idno>
<imprint>
<date when="2002">2002</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">SPIE proceedings series</title>
<idno type="ISSN">1017-2653</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Character recognition</term>
<term>Database management system</term>
<term>Document management</term>
<term>Document processing</term>
<term>Image compression</term>
<term>Image databank</term>
<term>Image resolution</term>
<term>Legibility</term>
<term>Optical character recognition</term>
<term>Pattern recognition</term>
<term>Scanning</term>
<term>Software architecture</term>
<term>System management</term>
<term>System performance</term>
<term>Text</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Reconnaissance caractère</term>
<term>Reconnaissance forme</term>
<term>Reconnaissance optique caractère</term>
<term>Texte</term>
<term>Traitement document</term>
<term>Performance système</term>
<term>Système gestion base donnée</term>
<term>Banque image</term>
<term>Compression image</term>
<term>Résolution image</term>
<term>Architecture logiciel</term>
<term>Gestion système</term>
<term>Balayage</term>
<term>Gestion document</term>
<term>Lisibilité</term>
<term>JPEG2000</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Modern document scanning systems make it possible to process color documents with Document Management Systems (DMS). Because scanning a typical A4 document at 300 dpi generates an enormous amount of image data, image compression is used. The JPEG compression scheme is widely used for such image data. The loss of image quality caused by the necessary lossy compression can significantly reduce the recognition quality of a subsequent optical character recognition (OCR) process, which is essential to any DMS. The new standard JPEG2000 (Part 6), a high-performance system for compressing and archiving scanned documents, particularly those containing both text and images, bridges the gap between high compression and the legibility of documents managed inside a DMS. Using JPEG2000 (Part 6) yields substantially higher image quality than standard compression techniques. This quality is achieved by combining automatic text detection with bitonal compression of text and color/grayscale wavelet compression of images. Because the JPEG2000 (Part 6) compression scheme is a complex image processing system that requires nontrivial computational resources, a scalable software system has been designed to match the throughput of high-performance document scanners.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Allemagne</li>
</country>
<region>
<li>Berlin</li>
</region>
<settlement>
<li>Berlin</li>
</settlement>
</list>
<tree>
<country name="Allemagne">
<region name="Berlin">
<name sortKey="Thierschmann, Michael" sort="Thierschmann, Michael" uniqKey="Thierschmann M" first="Michael" last="Thierschmann">Michael Thierschmann</name>
</region>
<name sortKey="Martin, Uwe Erik" sort="Martin, Uwe Erik" uniqKey="Martin U" first="Uwe-Erik" last="Martin">Uwe-Erik Martin</name>
</country>
</tree>
</affiliations>
</record>
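
As a hedged illustration of how such a record might be read programmatically, the sketch below extracts the title, authors, and English keywords with Python's standard `xml.etree.ElementTree`. The record as displayed uses the `wicri:` and `inist:` prefixes without declaring them, so a strict XML parser rejects it with an unbound-prefix error; the sketch works around this by binding both prefixes to placeholder URIs. Those URIs, the abridged `RECORD` excerpt, and the helper names `parse_record` and `summarize` are assumptions for illustration only and are not part of Dilib or Wicri.

```python
import xml.etree.ElementTree as ET

# Abridged excerpt of the record above: title, one author, two keywords.
RECORD = """<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">A scalable software-architecture for high-speed color Document compression based on JPEG2000 (Part 6)</title>
<author>
<name sortKey="Thierschmann, Michael" first="Michael" last="Thierschmann">Michael Thierschmann</name>
<affiliation wicri:level="3">
<country>Allemagne</country>
</affiliation>
</author>
</titleStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Image compression</term>
<term>Optical character recognition</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
</TEI>
</record>"""

# ASSUMPTION: placeholder namespace URIs, invented so the parser accepts
# the otherwise undeclared "wicri:" and "inist:" prefixes.
PLACEHOLDER_NS = 'xmlns:wicri="urn:example:wicri" xmlns:inist="urn:example:inist"'


def parse_record(text):
    """Bind the placeholder namespaces on the root element, then parse."""
    bound = text.replace("<record>", "<record " + PLACEHOLDER_NS + ">", 1)
    return ET.fromstring(bound)


def summarize(root):
    """Collect title, author names, and English keywords from a record."""
    return {
        "title": root.findtext(".//titleStmt/title"),
        "authors": [n.text for n in root.findall(".//titleStmt/author/name")],
        "keywords": [
            t.text for t in root.findall(".//keywords[@scheme='KwdEn']/term")
        ],
    }


if __name__ == "__main__":
    print(summarize(parse_record(RECORD)))
```

Restricting the author query to `titleStmt` avoids counting each author twice, since the full record repeats the author entries under `sourceDesc`.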

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001A00 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001A00 | SxmlIndent | more

To add a link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Pascal:02-0512026
   |texte=   A scalable software-architecture for high-speed color Document compression based on JPEG2000 (Part 6)
}}

Wicri

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024